training data foci test
Query use case
Do we trust the providers (origin) of all training data used by an AI system, checked against an exception list (blacklist)?
Schemas used
Pseudo code
FUNCTION ai_system_providers_trusted_with_blacklist(AI_System_ID, Blacklist_Emails)
    // Step 1: Retrieve provider UUIDs associated with the AI system
    SET Provider_UUIDs = get list of providers contributing data to AI_System_ID
    // Step 2: Retrieve provider email addresses
    SET Provider_Emails = map provider UUIDs to their identity email addresses
    // Step 3: Check if any provider email appears in the blacklist
    SET Matches = intersection of Provider_Emails and Blacklist_Emails
    // If there are no matches, the AI system providers are trusted
    IF Matches is empty THEN
        RETURN True
    ELSE
        RETURN False
    END IF
END FUNCTION
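The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not the actual query implementation: the two lookup callables (`get_provider_uuids`, `get_provider_email`), the in-memory dictionaries, and the `check` helper are all hypothetical stand-ins, and comparing emails case-insensitively is an assumption of this sketch.

```python
from typing import Callable, Iterable


def ai_system_providers_trusted_with_blacklist(
    ai_system_id: str,
    blacklist_emails: Iterable[str],
    get_provider_uuids: Callable[[str], Iterable[str]],
    get_provider_email: Callable[[str], str],
) -> bool:
    """Return True if no provider of the AI system's data is blacklisted."""
    # Step 1: provider UUIDs associated with the AI system.
    provider_uuids = get_provider_uuids(ai_system_id)
    # Step 2: map UUIDs to emails (trimming and lower-casing is an assumption).
    provider_emails = {get_provider_email(u).strip().lower() for u in provider_uuids}
    # Step 3: trusted only if the two sets share no element.
    blacklist = {e.strip().lower() for e in blacklist_emails}
    return provider_emails.isdisjoint(blacklist)


# Hypothetical in-memory data standing in for the real lookups.
PROVIDERS_BY_SYSTEM = {"ai-system-1": ["uuid-a", "uuid-b"]}
EMAIL_BY_PROVIDER = {"uuid-a": "alice@example.org", "uuid-b": "bob@example.org"}


def check(ai_system_id: str, blacklist: Iterable[str]) -> bool:
    """Convenience wrapper that wires in the hypothetical lookups."""
    return ai_system_providers_trusted_with_blacklist(
        ai_system_id,
        blacklist,
        PROVIDERS_BY_SYSTEM.__getitem__,
        EMAIL_BY_PROVIDER.__getitem__,
    )
```

Note that, as in the pseudocode, an AI system with no recorded providers is reported as trusted, because an empty set of emails has no overlap with any blacklist. Whether that is the desired behaviour depends on the use case.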
Explanation
- Find relevant data sources:
  - Retrieve the configuration verification credential (ConfigVcId) for the AI system.
  - Extract the weights verification credential (WeightsVcId) used in training.
  - Ensure that the WeightsVcId is classified as "Weights".
  - Trace back to the training system that produced these weights.
  - Identify the datapack used in the training process.
- Extract the list of Data Verification Credentials (DataVcIds) used in training from the datapack.
- Determine the providers who contributed this data:
  - For each DataVcId, check its attestations and extract provider UUIDs where the attestation type is "provided".
- Map provider UUIDs to their email identities.
- Check if any provider's email appears in the blacklist.
- Return True only if there are no blacklisted providers (i.e., the intersection is empty).
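The traversal described above might look like the following Python sketch. The schemas are not given in this section, so the whole data model here is an assumption: the `STORE` layout, its key names (`config_vc_id`, `weights_vc_id`, `produced_by`, `datapack_id`, `data_vc_ids`, `provider_uuid`), and the "reviewed" attestation type are hypothetical and chosen only to mirror the steps.

```python
# Hypothetical in-memory store; every key name below is an assumption.
STORE = {
    "ai_systems": {"ai-1": {"config_vc_id": "cfg-1"}},
    "config_vcs": {"cfg-1": {"weights_vc_id": "w-1"}},
    "vcs": {
        "w-1": {"classification": "Weights", "produced_by": "train-1"},
        "d-1": {"attestations": [{"type": "provided", "provider_uuid": "uuid-a"}]},
        "d-2": {"attestations": [
            {"type": "provided", "provider_uuid": "uuid-b"},
            {"type": "reviewed", "provider_uuid": "uuid-c"},
        ]},
    },
    "training_systems": {"train-1": {"datapack_id": "pack-1"}},
    "datapacks": {"pack-1": {"data_vc_ids": ["d-1", "d-2"]}},
    "identities": {
        "uuid-a": "alice@example.org",
        "uuid-b": "bob@example.org",
        "uuid-c": "carol@example.org",
    },
}


def collect_provider_uuids(store: dict, ai_system_id: str) -> set:
    """Walk AI system -> config VC -> weights VC -> training system -> datapack."""
    config_vc_id = store["ai_systems"][ai_system_id]["config_vc_id"]
    weights_vc_id = store["config_vcs"][config_vc_id]["weights_vc_id"]
    weights_vc = store["vcs"][weights_vc_id]
    # The weights credential must actually be classified as "Weights".
    if weights_vc["classification"] != "Weights":
        raise ValueError(f"{weights_vc_id} is not classified as Weights")
    training_system = store["training_systems"][weights_vc["produced_by"]]
    datapack = store["datapacks"][training_system["datapack_id"]]
    providers = set()
    for data_vc_id in datapack["data_vc_ids"]:
        for attestation in store["vcs"][data_vc_id]["attestations"]:
            # Only attestations of type "provided" identify a data provider.
            if attestation["type"] == "provided":
                providers.add(attestation["provider_uuid"])
    return providers


def providers_trusted(store: dict, ai_system_id: str, blacklist: set) -> bool:
    """Return True only if no provider email appears in the blacklist."""
    emails = {store["identities"][u] for u in collect_provider_uuids(store, ai_system_id)}
    return emails.isdisjoint(blacklist)
```

In this sample data, `uuid-c` only holds a "reviewed" attestation, so blacklisting `carol@example.org` would not affect the result, while blacklisting `bob@example.org` would make the system untrusted.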
Query
ai_system_providers_trusted_with_blacklist(AiSystemId, Blacklist)
- link to query
- link to simulator